Auto-exploratory Average Reward Reinforcement Learning
Authors
Abstract
We introduce a model-based average reward Reinforcement Learning method called H-learning and compare it with its discounted counterpart, Adaptive Real-Time Dynamic Programming, in a simulated robot scheduling task. We also introduce an extension to H-learning, which automatically explores the unexplored parts of the state space, while always choosing greedy actions with respect to the current value function. We show that this “Auto-exploratory H-learning” performs better than the original H-learning under previously studied exploration methods such as random, recency-based, or counter-based ex-
Similar resources
Technology in Spiritual Formation: An Exploratory Study of Computer Mediated Religious Communications
In this paper, we report findings from a study of American Christian ministers’ uses of technologies in religious practices. We focus on the use of technologies for spiritual purposes as opposed to pragmatic and logistical, but report on all. We present results about the uses of technologies in three aspects of religious work: religious study and reflection, chur...
Full text
Impulse Purchasing Behaviors of the Turkish Consumers in Websites as a Dynamic Consumer Model: Technology Products Example
This paper examines the concept of online impulse purchasing behavior. The phenomenon of impulse purchasing has been researched in consumer research, as well as in psychology and economics, since the 1950s. A detailed review and analysis of the literature asserts that there are some unsolved issues regarding the state of knowledge on impulse purchasing behavior. Fur...
Full text
Incentives to Settle Under Joint and Several Liability: An Empirical Analysis of Superfund Litigation
Congress may soon restrict joint and several liability for cleanup of contaminated sites under Superfund. We explore whether this change would discourage settlements and is therefore likely to increase the program's already high litigation costs per site. Recent theoretical research by Kornhauser and Revesz finds that joint and several liability may either encourage or discourag...
Full text
Inbreeding Effects on Average Daily Gains and Kleiber Ratios in Iranian Moghani Sheep
The objective of the present study was to evaluate the effects of inbreeding on average daily gains and Kleiber ratios in Moghani sheep. Traits included average daily gain from birth to 3 months (ADG1), average daily gain from birth to 6 months (ADG2), average daily gain from 3 months to 6 months (ADG3), average daily gain from 3 months to 9 months (ADG4), average daily gain from 3 months to ye...
Full text
LIMIT AVERAGE SHADOWING AND DOMINATED SPLITTING
In this paper the notion of limit average shadowing property is introduced for diffeomorphisms on a compact smooth manifold M and a class of diffeomorphisms is given which has the limit average shadowing property, but does not have the shadowing property. Moreover, we prove that for a closed f-invariant set Lambda of a diffeomorphism f, if Lambda is C1-stably limit average shadowing and t...
Full text